832 research outputs found

    Accelerating human-in-the-loop machine learning

    Get PDF
    Machine learning workflow development is a process of trial-and-error: developers iterate on workflows by testing out small modifications until the desired accuracy is achieved. Unfortunately, existing machine learning systems focus narrowly on model training—a small fraction of the overall development time—and neglect to address iterative development. We propose Helix, a machine learning system that optimizes the execution across iterations—intelligently caching and reusing, or recomputing intermediates as appropriate. Helix captures a wide variety of application needs within its Scala DSL, with succinct syntax defining unified processes for data preprocessing, model specification, and learning. We demonstrate that the reuse problem can be cast as a Max-Flow problem, while the caching problem is NP-Hard. We develop effective lightweight heuristics for the latter. Empirical evaluation shows that Helix is not only able to handle a wide variety of use cases in one unified workflow but also much faster, providing run time reductions of up to 19× over state-of-the-art systems, such as DeepDive or KeystoneML, on four real-world applications in natural language processing, computer vision, social and natural sciences

    Parallel Computation Using Active Self-assembly

    Get PDF
    We study the computational complexity of the recently proposed nubots model of molecular-scale self-assembly. The model generalizes asynchronous cellular automaton to have non-local movement where large assemblies of molecules can be moved around, analogous to millions of molecular motors in animal muscle effecting the rapid movement of large arms and legs. We show that nubots is capable of simulating Boolean circuits of polylogarithmic depth and polynomial size, in only polylogarithmic expected time. In computational complexity terms, any problem from the complexity class NC is solved in polylogarithmic expected time on nubots that use a polynomial amount of workspace. Along the way, we give fast parallel algorithms for a number of problems including line growth, sorting, Boolean matrix multiplication and space-bounded Turing machine simulation, all using a constant number of nubot states (monomer types). Circuit depth is a well-studied notion of parallel time, and our result implies that nubots is a highly parallel model of computation in a formal sense. Thus, adding a movement primitive to an asynchronous non-deterministic cellular automation, as in nubots, drastically increases its parallel processing abilities
    • …
    corecore